Note: This site is
a bit older, personal views
may have changed.

Capstring Is A String That Grows In Chunks


Around 2004-2005, I had ideas about a string that was better than a Pascal ansistring in many ways: faster, and better suited to everyday concatenations, especially in loops. Programmers use loops constantly, and loops are probably the most common place for bottlenecks to occur.

Benchmark from Year 2007:

The benchmark above shows that ansistrings can take minutes, or sometimes hours, to concatenate in big loops, while a capstring can do the same concatenation in less than one second.

Concatenations are one area where speed really does matter, especially in loops. Concatenations are slow when we do this with ansistrings:

  for i := 1 to Count do
  begin
    S := S + Piece;  // every pass can reallocate S and copy it again
  end;
The so-called solution that many people suggest is to use pchars (char* in C) instead, or to use an ansistring and call setlength() and uniquestring() ahead of time in modern Pascal. But that's annoying and tedious. There is a better way.
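
To show why the manual route is tedious, here is a minimal sketch of the setlength() approach for Free Pascal. The names Piece, Count, and Idx are made up for illustration, and it only works because the final size is known before the loop starts.

```pascal
program PreallocSketch;
{$mode objfpc}{$H+}

const
  Count = 100000;      // hypothetical loop count

var
  S, Piece: AnsiString;
  i, Idx: Integer;
begin
  Piece := 'abc';      // hypothetical fragment appended on each pass
  // Reserve the final size once, up front.
  SetLength(S, Count * Length(Piece));
  Idx := 1;
  for i := 1 to Count do
  begin
    // Copy the fragment into place by index; a wrong Idx writes out of bounds.
    Move(Piece[1], S[Idx], Length(Piece));
    Inc(Idx, Length(Piece));
  end;
  WriteLn(Length(S));  // prints 300000
end.
```

All the bookkeeping (the total size, the running index, the raw Move) is on the programmer, which is exactly the part a capstring is meant to hide.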

With a Capstring, you set a growth capacity. The string then grows in chunks, instead of memory being allocated in piddly small amounts with each concatenation, as happens with an ansistring or pchar.

With an ansistring, calling setlength() and then writing to the string by character index (s[i]) takes a lot of work and is prone to out-of-bounds errors. With a capstring you just concatenate away like normal, but using an add function (addstr). In the future, when compilers support capstring, no addstr() call will be needed, as it will be a built-in type.
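
The idea can be sketched roughly as below. This is not the code from the download: the record layout, InitCapStr, CapStrToString, and the 4096 chunk size are assumptions made for illustration, and only the addstr name comes from the text above.

```pascal
program CapStrSketch;
{$mode objfpc}{$H+}

type
  // Hypothetical layout; the real capstring record may differ.
  TCapStr = record
    Data: AnsiString;  // buffer; Length(Data) is the current capacity
    Len: Integer;      // number of characters actually in use
    Chunk: Integer;    // growth step, in characters
  end;

procedure InitCapStr(out C: TCapStr; AChunk: Integer);
begin
  if AChunk < 1 then AChunk := 1;  // guard against a zero or negative step
  C.Data := '';
  C.Len := 0;
  C.Chunk := AChunk;
end;

procedure AddStr(var C: TCapStr; const S: AnsiString);
var
  Needed: Integer;
begin
  if S = '' then Exit;
  Needed := C.Len + Length(S);
  // Reallocate only when the text no longer fits, rounding the new
  // capacity up to the next multiple of the chunk size.
  if Needed > Length(C.Data) then
    SetLength(C.Data, ((Needed + C.Chunk - 1) div C.Chunk) * C.Chunk);
  Move(S[1], C.Data[C.Len + 1], Length(S));
  C.Len := Needed;
end;

// Return only the used part of the buffer as a normal string.
function CapStrToString(const C: TCapStr): AnsiString;
begin
  Result := Copy(C.Data, 1, C.Len);
end;

var
  C: TCapStr;
  i: Integer;
begin
  InitCapStr(C, 4096);
  for i := 1 to 100000 do
    AddStr(C, 'abc');
  WriteLn(C.Len);           // characters in use
  WriteLn(Length(C.Data));  // reserved capacity, a multiple of the chunk
end.
```

With a chunk of 4096, most calls to AddStr only copy a few bytes into space that is already reserved; a reallocation happens only when a chunk boundary is crossed.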

Side note:

Dear compiler writers, please consider building this type into your compilers.. delphi, fpc, ada, java, and c++ compiler writers especially.

If it can't be implemented into a compiler immediately (since compilers generally take a long time to gain new features), then another idea for implementing capstring temporarily would be to use operator overloading or preprocessor macro tricks. I hate those though, as that is C++ish.

Once a capstring type was built into a compiler it would work just like ansistrings:

  capstr + capstr + capstr 
None of this below (C way of doing it):
 concat(capstr1, capstr2)
However, just to show you what a capstring is you can see it on SVN below.

Download Capstring for FPC/Delphi

This download is not a built-in type for FPC/Delphi. It is just a record/struct implementation that is simple and easy to use, but not as easy as a built-in type like an ansistring. It is still much easier than using a pchar or a manually hacked ansistring (such as when you call setlength or uniquestring for performance and access it char by char).

I looked around for something like Capstring a few months/years ago and didn't see anyone who had thought of this. So I don't think all algorithms have been invented yet (of course not, but some people are stuck in their ways and don't believe in new inventions). I've heard Java has some sort of string that you can adjust, but I don't know if it acts like a capacitor that can grow its capacity in chunks.

Capacity types that grow in chunks are not just useful for strings, but for buffers, lists, arrays, and other types too.

If I was an ASSHOLE like IBM or MICROSOFT, then I would PATENT THIS and charge compiler writers money every time they used the Capstring in their code. But since I'm not an ASSHOLE, I'm offering it to the public.

Will Freepascal use a Capstr built in type?

I especially encourage the FPC development team to use this idea in their next compiler version (3.0.0?).

Portability of Capstring

I've always hated the fact that ansistrings in Delphi/FPC aren't compatible with C/C++, because programmers are too arrogant to get together and make languages compatible. But we can implement the Capstring in C/C++ and have it compatible with modern Pascal programs by using a struct or class to hold the capstring. It should, however, eventually be implemented into the compiler as a built-in automated type, hopefully in both C++ and modern Pascal, i.e. just like an ansistring or dynamic array is a built-in dynamic type with syntactic sugar.

Why not just ship it as an object/class?

The capstring could just be an object like a TStringList, but that requires Free/Create and it requires line noise. Add(), Free(), and Create() cause programs to be bloated up with irrelevant code.

An ansistring is just syntax sugar for a charlist if you think about it (a list of characters without having to Free/Create it), so a capstring would be syntax sugar for a charlist with capacity.

EndUpdate is For What?

This is a bit like BeginUpdate and EndUpdate. The EndUpdate call should be available to the programmer using the capstring so that the programmer can choose when to close off the string for reading (otherwise the string has extra empty memory wasted due to capacitance).
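
A rough sketch of what closing off the string could look like, assuming a hypothetical record that tracks the used length separately from the buffer. The field names and the single SetLength call are my illustration here, not necessarily how the downloadable version does it.

```pascal
program EndUpdateSketch;
{$mode objfpc}{$H+}

type
  // Hypothetical record, not the actual capstring declaration.
  TCapStr = record
    Data: AnsiString;  // buffer; Length(Data) is the capacity
    Len: Integer;      // characters in use
  end;

// Trim the unused tail so Data holds exactly the text written so far.
procedure EndUpdate(var C: TCapStr);
begin
  SetLength(C.Data, C.Len);
end;

var
  C: TCapStr;
  Src: AnsiString;
begin
  // Simulate a capstring that reserved 4096 characters but used only 5.
  Src := 'hello';
  SetLength(C.Data, 4096);
  Move(Src[1], C.Data[1], Length(Src));
  C.Len := Length(Src);

  EndUpdate(C);
  WriteLn(C.Data);          // prints hello
  WriteLn(Length(C.Data));  // prints 5
end.
```

After EndUpdate the buffer can be read like any ordinary ansistring, with no spare capacity left hanging off the end.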
